Yelong Shen

Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math

Apr 30, 2025
Via arXiv

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Apr 29, 2025
Via arXiv

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Mar 03, 2025
Via arXiv

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Jan 23, 2025
Via arXiv

Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation

Dec 17, 2024
Via arXiv

Mojito: Motion Trajectory and Intensity Control for Video Generation

Dec 12, 2024
Via arXiv

StreamAdapter: Efficient Test Time Adaptation from Contextual Streams

Nov 14, 2024
Via arXiv

MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning

Oct 15, 2024
Via arXiv

Temperature-Centric Investigation of Speculative Decoding with Knowledge Distillation

Oct 14, 2024
Via arXiv

LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy

Oct 04, 2024
Via arXiv